ABSTRACT

We model the luminosity-dependent projected and redshift-space
two-point correlation functions (2PCFs) of the Sloan Digital Sky
Survey (SDSS) Data Release 7 Main galaxy sample, using the halo
occupation distribution (HOD) model and the subhalo abundance
matching (SHAM) model and its extension. All the models are built
on the same high-resolution N-body simulations. We find that the
HOD model generally provides the best performance in reproducing
the clustering measurements in both projected and redshift spaces.
The SHAM model with the same halo-galaxy relation for central and
satellite galaxies (or distinct haloes and subhaloes), when
including scatter, has a best-fitting χ²/dof around 2-3. We
therefore extend the SHAM model to the subhalo clustering and
abundance matching (SCAM) model by allowing the central and
satellite galaxies to have different galaxy-halo relations. We
infer the corresponding halo/subhalo parameters by jointly fitting
the galaxy 2PCFs and abundances and consider subhaloes selected
based on three properties: the mass M_acc at the time of
accretion, the maximum circular velocity V_acc at the time of
accretion, and the peak maximum circular velocity V_peak over the
history of the subhaloes. The three subhalo models work well for
luminous galaxy samples (with luminosity above L*). For
low-luminosity samples, the V_acc model stands out in reproducing
the data, with the V_peak model slightly worse, while the M_acc
model fails to fit the data. We discuss the implications of the
modelling results.

1 INTRODUCTION

The connection between the observed galaxy distribution
and the underlying dark matter is a fundamental question in
modern cosmology. It can help us understand the dark matter
component of the energy density distribution from the
observed baryon components. Contemporary galaxy formation
models assume that galaxies form and evolve within
dark matter haloes (White & Rees 1978). Therefore, we
can use the dark matter haloes to build the connection
between the luminous and dark sides of the universe.

There are multiple ways of linking galaxies to
dark matter haloes. The most straightforward method is
to employ hydrodynamic simulations that take into account
the complicated physics involved in galaxy formation
and evolution (see the latest such simulations in
e.g. Vogelsberger et al. 2014a; Schaye et al. 2015), as well
as semi-analytic models that are built on the halo
merger trees from N-body dark matter simulations (e.g.
Bower et al. 2006; Croton et al. 2006; Somerville et al. 2008;
Guo et al. 2011). But the poorly understood baryonic
processes of galaxy formation make such methods model
dependent, and they struggle to reproduce the observations
satisfactorily at the current level of data accuracy. Other
statistical methods have therefore been developed to avoid
the need to include the galaxy formation physics and to make
use of the population of dark matter haloes, whose formation
is dominated by gravity and well understood. Such methods
aim at empirically establishing the connection between
galaxies and dark matter haloes from statistical distributions
of galaxies such as galaxy clustering, and the inferred
galaxy-halo connection is then used to constrain galaxy
formation and evolution. The most popular models are the halo
occupation distribution (HOD; Jing et al. 1998; Peacock & Smith
2000; Berlind & Weinberg 2002; Zheng et al. 2005, 2009;
Leauthaud et al. 2012; Guo et al. 2014; Skibba et al. 2015;
Zu & Mandelbaum 2015), the closely related conditional
luminosity function (CLF; Yang et al. 2003, 2004), and
the subhalo abundance matching (SHAM; Kravtsov et al.
2004; Conroy et al. 2006; Vale & Ostriker 2006; Wang et al.
2007; Behroozi et al. 2010; Guo et al. 2010; Moster et al.
2010; Nuza et al. 2013; Rodríguez-Puebla et al. 2013;
Sawala et al. 2015; Yamamoto et al. 2015). All of these
methods are based on the halo framework, assuming that
all galaxies reside in haloes. In this paper we focus on
detailed and quantitative model comparisons between
the HOD and SHAM methods.

The HOD description includes the probability P(N|M)
of finding N galaxies of certain properties in a dark matter
halo of virial mass M, and the spatial and velocity
distribution of those galaxies inside haloes. Analytical
methods have been developed within the HOD (or CLF)
framework to compute galaxy clustering statistics (e.g. Zheng
2004; Tinker et al. 2005; van den Bosch et al. 2013). By
using dark matter haloes identified in high-resolution N-body
simulations, the HOD model can be made accurate enough
to interpret the observed high-precision galaxy clustering
measurements from large galaxy surveys (Zheng & Guo
2016), which overcomes the difficulty of modelling the effects
of halo exclusion, nonlinear growth, and scale-dependent
halo bias in the analytical HOD models (e.g. Zheng 2004;
Tinker et al. 2005). Based on galaxy formation models,
galaxies in the HOD model are further categorized into
central and satellite galaxies according to their spatial
distribution within the haloes. In many applications, central
galaxies are put at halo centres and assumed to have the
velocities of the haloes, while satellite galaxies are assumed
to follow the spatial and velocity distributions of the dark
matter in the haloes. However, the HOD description itself
allows the freedom of varying the above assumptions, by
introducing spatial bias and velocity bias. For example, the
recent modelling of small-scale redshift-space clustering
measurements using both the Sloan Digital Sky Survey (SDSS)
Main galaxy sample (Guo et al. 2015c) and the SDSS-III Baryon
Oscillation Spectroscopic Survey (Guo et al. 2015a) shows
that central galaxies have velocity offsets with respect to the
halo bulk velocities and that the velocity distribution of
satellite galaxies generally differs from that of the dark
matter. By including such velocity bias factors, the HOD model
is able to reproduce the observed galaxy two-point correlation
functions (2PCFs) in both projected and redshift spaces
remarkably well and to interpret successfully higher-order
statistics, such as the three-point correlation functions
(Guo et al. 2015b).

The development of high-resolution N-body simulations
enables the identification of the substructures within
the dark matter haloes, i.e. the subhaloes, which were
distinct haloes before they fell into the current host haloes (see
e.g. Klypin et al. 2016; Pujol et al. 2014). As in the
literature, we refer to virialized haloes that are not subhaloes of
another halo as distinct haloes. The subhaloes are believed
to be the natural local environments for the satellite galaxies
in the host haloes. Due to their trackable merger histories,
the subhaloes provide a powerful way to study galaxy
evolution once the connection between satellite galaxies and
subhaloes is built. The basic idea of the SHAM method is to
assume a monotonic relation between a certain galaxy
property and a certain halo (including subhalo) property. For
example, the one-to-one correspondence between the galaxies
and the dark matter haloes (and subhaloes) can be made by
ranking the galaxies in order of their luminosity and
populating the more massive haloes (and subhaloes) with more
luminous galaxies, i.e. the number density of galaxies above
a luminosity threshold is matched to that of haloes above
a mass threshold, establishing a link between galaxy
luminosity and halo mass. In this way, the galaxies related to
the host haloes are naturally central galaxies while those in
the subhaloes are satellite galaxies. In practice, the SHAM
method always includes a scatter in the galaxy-halo/subhalo
relation, which has its physical origin.

Accurately identifying and defining the subhaloes in the
simulations should take into account the effects of both the
simulation resolution and baryon physics (Weinberg et al.
2008). While the resolution effect is less severe with the
emergence of more and more high-resolution simulations, the
baryon physics can still give rise to an important systematic
effect for the SHAM method. Compared to the stellar
components of satellite galaxies, which are more gravitationally
bound, the dark matter in subhaloes suffers more from tidal
heating and stripping. Galaxy properties are therefore more
closely connected to subhalo properties that are less affected
by the tidal effects. The original SHAM method has been improved
by relating the satellite galaxy properties to the maximum
circular velocity or the mass of subhaloes at the epoch of
accretion (see e.g. Conroy et al. 2006; Vale & Ostriker 2006)
or over the entire merger history (see e.g. Moster et al. 2010;
Reddick et al. 2013). Such improvement is shown to better
reproduce the observed galaxy clustering measurements.
However, some effects are yet to be taken into account in the
SHAM model. For example, some subhaloes can be tidally
destroyed while the corresponding satellite galaxies (stellar
component) can still survive (the so-called orphan galaxies;
Wang et al. 2006; Moster et al. 2010), and the usual SHAM
model based on N-body simulations would miss such a
population.

The different halo and subhalo models have been studied
extensively in the previous literature (see e.g. Yang et al.
2012). Yang et al. (2009) used the CLF method to explore
the consequence of the stellar mass evolution of the
satellite galaxies assuming the same stellar-halo mass relation
(SHMR) for host haloes at present day and subhaloes at
the time of accretion. They used the galaxy group
catalogues (Yang et al. 2005) constructed from SDSS DR4 to
predict the stellar mass function of the satellite galaxies
and emphasized the importance of including intracluster stars
in the galaxy evolution. Neistein et al. (2011a) studied the
SHMR for central and satellite galaxies in the SHAM using
a set of semi-analytical models (SAMs). They found that
adopting the same SHMR for central and satellite galaxies
cannot reproduce the clustering measurements in SAMs.
Neistein et al. (2011b) further extended the SHAM models
by allowing the stellar mass of the satellite galaxies to also
depend on the host halo mass and concluded that the SHMR
is not well constrained from the clustering measurements
alone. Rodríguez-Puebla et al. (2012) also found that
different SHMRs for central and satellite galaxies are favoured
by the observations, using the central and satellite
stellar mass functions from the galaxy group catalogues. The
SHAM technique was also examined in smoothed particle
hydrodynamics simulations by Simha et al. (2012), and it was
found to overpopulate massive haloes because of severe
stellar mass loss of some satellite galaxies. Reddick et al. (2013)
compared the connection between different halo properties
and the galaxy stellar mass in the SHAM models. The
scatter between galaxy stellar mass and halo property was
constrained by the galaxy clustering measurements and the
conditional stellar mass functions. They found that the model
with the halo peak circular velocity provides the best
agreement with the data.

The galaxy projected 2PCFs have been extensively
used previously in constraining the models. However, the
redshift-space clustering measurements have additional
information about the galaxy velocity field and therefore can
help distinguish different models. In this paper, we compare
quantitatively the HOD and (extended) SHAM methods
in modelling both the projected and redshift-space
clustering of the volume-limited luminosity-threshold galaxy
samples in the SDSS Data Release 7 (DR7). The galaxy-halo
connections for the central and satellite galaxies are
allowed to be different in the extended SHAM models.
Unlike Rodríguez-Puebla et al. (2012), who apply SHAM
separately to central and satellite stellar mass functions based
on a group catalogue, we constrain all parameters of the
extended SHAM models using the galaxy clustering
measurements and the galaxy sample number densities. In
Section 2, we describe the measurements of our galaxy
samples and the modelling method. The subhalo distributions
in the high-resolution simulations are investigated in
Section 3. We present the results of modelling the projected and
redshift-space clustering measurements in Sections 4 and 5,
respectively. Finally, we summarize our results and discuss
the possible applications in Section 6. Throughout the
paper, we assume a spatially flat Λ cold dark matter
cosmology, with Ω_m = 0.307, h = 0.678, and σ_8 = 0.823,
consistent with the constraints from Planck (Planck Collaboration
2014). The halo mass used in this paper is calculated based
on the given spherical overdensities of a virialized structure
(Bryan & Norman 1998).


2 MEASUREMENTS AND MODELS 

In this paper, we use the galaxies in the New York
University Value-Added Galaxy Catalog (NYU-VAGC;
Blanton et al. 2005) for the SDSS DR7 Main galaxy sample
(Abazajian et al. 2009). We further construct eight
volume-limited luminosity-threshold samples, with absolute r-band
Petrosian magnitude M_r varying from -18 to -21.5 in
steps of 0.5. We refer the readers to Guo et al. (2015c,
hereafter G15) for more details.

The projected 2PCF w_p(r_p) and the redshift-space 2PCF
monopole (ξ_0(s)), quadrupole (ξ_2(s)) and hexadecapole
(ξ_4(s)) moments are measured for each sample, where r_p
and s are the transverse and redshift-space separations of
galaxy pairs, respectively. The galaxy 2PCF measurements
range from small scales of 0.1 h⁻¹ Mpc to intermediate scales
of 25 h⁻¹ Mpc. The projected 2PCF w_p(r_p) is measured by
integrating the redshift-space 3D 2PCF to a maximum
line-of-sight pair separation of 40 h⁻¹ Mpc (also adopted in all
the models). The covariance matrix for each sample is
estimated with the jackknife resampling method (Zehavi et al.
2011; Guo et al. 2013).
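
The jackknife covariance estimate mentioned above can be sketched as follows. This is a minimal illustration rather than the pipeline used for the measurements: the function name, the input layout (one row per leave-one-out measurement of the data vector), and the choice of plain Python lists are assumptions made here, and the number of jackknife regions is not specified in this section.

```python
def jackknife_covariance(samples):
    """Covariance matrix from leave-one-out jackknife measurements.

    samples[k][i] is bin i of the statistic measured with the k-th
    sub-region of the survey removed (hypothetical input layout).
    """
    n = len(samples)
    n_bins = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(n_bins)]
    cov = [[0.0] * n_bins for _ in range(n_bins)]
    for s in samples:
        for i in range(n_bins):
            for j in range(n_bins):
                cov[i][j] += (s[i] - mean[i]) * (s[j] - mean[j])
    # The (n - 1) / n prefactor accounts for the strong overlap
    # between leave-one-out subsamples.
    factor = (n - 1) / n
    return [[factor * c for c in row] for row in cov]
```

The rows of `samples` would hold the concatenated w_p and multipole bins, so that the off-diagonal blocks capture correlations between the different statistics.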

We follow the simulation-based model method laid out
in Zheng & Guo (2016) to interpret the galaxy 2PCF
measurements within the HOD and SHAM frameworks. It has
been used in G15 and Guo et al. (2015a). With haloes
identified in a high-resolution N-body simulation, this method
tabulates all the necessary halo components in calculating
galaxy 2PCFs, including one-halo pair distributions and
two-halo 2PCFs from pairs composed of different combinations
of central and satellite galaxies. With such tables and
a specified description/parametrization of the galaxy-halo
relation (e.g. within the HOD and SHAM frameworks), galaxy
2PCFs are simply obtained by summing over different,
pre-calculated table elements, weighted by the corresponding
galaxy occupation statistics. With a given set of HOD (and
SHAM) parameters, this method is equivalent to, but more
efficient than, directly assigning galaxies to haloes (and
subhaloes) in the simulation and measuring the corresponding
model 2PCFs. Compared to analytical models, it ensures
high accuracy by using the halo information directly from
the simulations and by calculating 2PCFs with exactly the
same binning scheme as in the data. Finally, this method
provides an efficient way to explore the parameter space for
different models, which serves well our purpose in this paper.
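
The idea of summing pre-calculated table elements weighted by the occupation statistics can be illustrated with a single term. The sketch below computes only a two-halo contribution from one set of halo-property bins; the full method also includes one-halo pair distributions and separate central/satellite combinations, which are omitted here. The function name, the table layout, and the normalization by the squared galaxy number density are assumptions made for illustration.

```python
def two_halo_xi(n_halo, n_occ, xi_hh):
    """Two-halo galaxy 2PCF from tabulated halo-halo 2PCFs.

    n_halo[i]      : number density of haloes in halo-property bin i
    n_occ[i]       : mean number of galaxies per halo in bin i
    xi_hh[i][j][r] : tabulated halo-halo 2PCF between bins i and j
                     in separation bin r (hypothetical layout)
    """
    # Galaxy number density contributed by each halo bin.
    weights = [n * occ for n, occ in zip(n_halo, n_occ)]
    n_gal = sum(weights)
    n_r = len(xi_hh[0][0])
    xi = [0.0] * n_r
    # Pair-weighted sum over all combinations of halo bins.
    for i, w_i in enumerate(weights):
        for j, w_j in enumerate(weights):
            for r in range(n_r):
                xi[r] += w_i * w_j * xi_hh[i][j][r]
    return [x / n_gal**2 for x in xi]
```

Because the table `xi_hh` does not depend on the model parameters, only the weights need to be recomputed at each step of a parameter search, which is what makes the approach efficient.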

We use the MultiDark simulation of Planck cosmology
(MDPL^1; Klypin et al. 2016), with the cosmological
parameters of Ω_m = 0.307, Ω_b = 0.048, h = 0.678, n_s =
0.96, and σ_8 = 0.823. The simulation has a volume of
1 h⁻³ Gpc³ (comoving) and the mass resolution is as low
as 1.51 × 10⁹ h⁻¹ M_⊙. The simulation output at z = 0 is
adopted to model all our luminosity-threshold galaxy
samples. To see how simulation resolution affects the subhalo
population, we also investigate a smaller simulation that
has the same cosmological parameters as MDPL, but with
a volume of 0.4³ h⁻³ Gpc³, which is referred to as SMDPL
(Klypin et al. 2016). This simulation was run with the same
number of particles (3840³) as in MDPL, so its mass
resolution is 9.6 × 10⁷ h⁻¹ M_⊙, about 15.6 times finer than MDPL.

^1 The simulation is named MDPL2 and is publicly available at
https://www.cosmosim.org/cms/simulations/multidark-project/mdpl2/
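
As a quick consistency check on the quoted mass resolutions, the particle mass follows from the mean matter density, the box volume, and the particle number. The sketch below assumes the standard value of the critical density, 2.775 × 10¹¹ h² M_⊙ Mpc⁻³ (not quoted in the text), and uses a hypothetical helper name.

```python
RHO_CRIT = 2.775e11  # critical density in h^2 Msun / Mpc^3 (assumed value)

def particle_mass(omega_m, box_mpc_h, n_per_side):
    """Particle mass in Msun/h for a cubic box of side box_mpc_h (Mpc/h)
    sampled with n_per_side^3 particles."""
    return omega_m * RHO_CRIT * box_mpc_h**3 / n_per_side**3

m_mdpl = particle_mass(0.307, 1000.0, 3840)   # close to 1.5e9 Msun/h
m_smdpl = particle_mass(0.307, 400.0, 3840)   # close to 9.6e7 Msun/h
ratio = m_mdpl / m_smdpl                      # (1000 / 400)^3 = 15.625
```

With the same particle number, the resolution gain is simply the volume ratio, which reproduces the factor of about 15.6 quoted above.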

In both MDPL and SMDPL, the dark matter haloes
and subhaloes are identified with the Rockstar phase-space
halo finder (Behroozi et al. 2013), where the spherical
haloes are found from the density peaks in phase
space. The Rockstar code is efficient and accurate in finding
the bound (sub)structures in the simulations (Onions et al.
2012; Knebe et al. 2013). Note that, unlike in G15, the
unbound particles are removed from our halo (and subhalo)
catalogue. The halo (subhalo) velocities are defined as the
average particle velocity within the innermost 10% of the
halo (subhalo) radius, which is different from the definition
of centre-of-mass velocity (i.e. bulk velocity) of haloes in
G15. The different halo velocity definitions will affect the
inferred galaxy velocity bias parameters. This change of halo
definition is made to match that in the publicly available
Rockstar halo and subhalo catalogues. However, since we use the
same halo catalogues for the HOD and SHAM models, the
comparison in this paper is not affected by the definitions of
haloes and halo properties. We consider three sets of models
to connect galaxies to the dark matter haloes in the
following sections. To avoid confusion, the host haloes and distinct
haloes mentioned hereafter refer to the haloes that are not
subhaloes of any other dark matter haloes.


2.1 The HOD Model 


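For a sample of galaxies above a given luminosity threshold,
the HOD model includes five parameters for describing the
average number N of galaxies in distinct haloes of mass M_h
(Zheng, Coil & Zehavi 2007),

$$\langle N(M_h)\rangle = \langle N_{\rm cen}(M_h)\rangle + \langle N_{\rm sat}(M_h)\rangle, \qquad (1)$$

$$\langle N_{\rm cen}(M_h)\rangle = \frac{1}{2}\left[1 + {\rm erf}\left(\frac{\log M_h - \log M_{\rm min}}{\sigma_{\log M_h}}\right)\right], \qquad (2)$$

$$\langle N_{\rm sat}(M_h)\rangle = \langle N_{\rm cen}(M_h)\rangle \left(\frac{M_h - M_0}{M_1'}\right)^{\alpha}, \qquad (3)$$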

where the two central galaxy parameters M_min and σ_logMh
describe the characteristic minimum mass of haloes that
host the sample of galaxies (⟨N_cen(M_min)⟩ = 0.5) and the
characteristic width of the transition mass range for haloes
hosting zero to one galaxy. The three parameters for the
satellite galaxies are the cutoff mass scale M_0, the
normalization mass scale M_1', and the power-law slope α at the
high-mass end. In this paper, we fix α = 1 in order to match the
slope of the subhalo occupation function in massive haloes
and to reduce the degrees of freedom (dof) to match that
in the SHAM model (see below). In the following sections,
we also compare two useful derived parameters, the
characteristic mass M_1 of haloes hosting on average one satellite
galaxy and the inferred satellite fraction f_sat (defined as the
fraction of the satellite galaxies in the sample).
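
The mean occupation functions of Eqs. 2 and 3 can be written as two short functions. This is a minimal sketch with hypothetical function and argument names; the parameter values used in the usage lines are purely illustrative and are not fitted values from this work. Setting the satellite occupation to zero below the cutoff mass M_0 is the usual convention and is assumed here.

```python
import math

def n_cen(m_h, log_m_min, sigma_log_m):
    """Mean central occupation of Eq. 2 (m_h in Msun/h)."""
    x = (math.log10(m_h) - log_m_min) / sigma_log_m
    return 0.5 * (1.0 + math.erf(x))

def n_sat(m_h, log_m_min, sigma_log_m, m_0, m_1p, alpha=1.0):
    """Mean satellite occupation of Eq. 3; assumed zero for m_h <= m_0."""
    if m_h <= m_0:
        return 0.0
    return n_cen(m_h, log_m_min, sigma_log_m) * ((m_h - m_0) / m_1p) ** alpha

# Illustrative (not fitted) parameters: log M_min = 12, width 0.3,
# M_0 = 1e12 and M_1' = 2e13 Msun/h, alpha fixed to 1.
half = n_cen(1.0e12, 12.0, 0.3)                    # <N_cen> = 0.5 at M_min
one_sat = n_sat(2.1e13, 12.0, 0.3, 1.0e12, 2.0e13)  # close to 1 satellite
```

In this example the derived mass M_1, where a halo hosts one satellite on average, is close to M_0 + M_1' because the central occupation has already saturated at that mass.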

We note that to compute the mean number of intra-halo
central-satellite pairs in the model, the occupation numbers
of central and satellite galaxies are assumed to be
independent of each other. That is, we have ⟨N_cen N_sat⟩ =
⟨N_cen⟩⟨N_sat⟩. Changing the assumption of the dependence
between the central and satellite occupations has only minimal
effects on the HOD parameters, as discussed in Fig. 10
of Guo et al. (2015a). Compared to the case of having
satellites only in haloes with central galaxies for a given galaxy
sample, we can now populate satellites in some low-mass
haloes without central galaxies. As a consequence, the
best-fitting α will decrease and the central galaxy velocity bias
will slightly shift to lower values, while other HOD
parameters only change by about 0.1%.

In our fiducial model, the central galaxies are assigned
the positions and velocities of the distinct haloes, while
random dark matter particles in the haloes are selected to
represent the satellite galaxies. As in G15, we introduce an
additional central galaxy velocity bias parameter α_c in the
HOD model to allow the central galaxy velocity to differ
from the halo velocity, with a velocity dispersion
equal to α_c times the dark matter particle velocity dispersion
σ_v in the haloes. We also include the satellite velocity bias
parameter α_s. The relative velocity of a satellite galaxy to
the halo centre is scaled by the satellite velocity bias α_s to
take into account the possible velocity differences between
the dark matter particles and the satellite galaxies. In the
frame of a single halo, the satellite galaxy velocity bias is
the same as the ratio between the velocity dispersions of the
satellite galaxies (σ_sat) and the dark matter particles within
the haloes, i.e. α_s = σ_sat/σ_v. We refer the readers to G15
for more details. In total, we have six free parameters in the
HOD model, four for the mean occupation function (M_min,
σ_logMh, M_0, and M_1') and two for the velocity bias (α_c and
α_s).
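
The two velocity bias parameters act on velocities in a simple way, sketched below for one velocity component. The Gaussian form of the central galaxy offset is an assumption made for this illustration (the text above specifies only its dispersion, α_c σ_v), and the function names are hypothetical.

```python
import random

def central_velocity(v_halo, sigma_v, alpha_c, rng=random):
    """Central galaxy velocities: halo velocity plus a random offset with
    dispersion alpha_c * sigma_v (Gaussian form assumed here)."""
    return [v + rng.gauss(0.0, alpha_c * sigma_v) for v in v_halo]

def satellite_velocity(v_particle, v_halo, alpha_s):
    """Satellite velocities: the velocity of the tracer particle relative
    to its halo is rescaled by alpha_s."""
    return [vh + alpha_s * (vp - vh) for vp, vh in zip(v_particle, v_halo)]
```

With α_c = 0 the central galaxies move with their haloes, and with α_s = 1 the satellites inherit the particle velocities unchanged, which recovers the case without velocity bias.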

We apply a Markov Chain Monte Carlo (MCMC)
method to explore the probability distribution of the model
parameters. The likelihood surface is determined by χ²,
contributed by the projected 2PCF w_p, the redshift-space
multipoles ξ_0, ξ_2 and ξ_4, and the observed galaxy number
density n_g,

$$\chi^2 = (\boldsymbol{\xi} - \boldsymbol{\xi}^*)^T \mathbf{C}^{-1} (\boldsymbol{\xi} - \boldsymbol{\xi}^*) + \frac{(n_g - n_g^*)^2}{\sigma_{n_g}^2}, \qquad (4)$$

where C is the full error covariance matrix and the data
vector ξ = [w_p, ξ_0, ξ_2, ξ_4]. The quantity with (without) a
superscript * is the one from the measurement (model).
To take into account the finite volume of the simulations
our model is based on, we also apply a volume correction of
1 + V_obs/V_sim to the covariance matrix (Zheng & Guo 2016),
where V_obs and V_sim are the volumes for the observed galaxy
sample and the simulation, respectively. For each sample and
each model, we perform MCMC runs with a length of two
million steps to explore the parameter space and to choose the set
of best-fitting parameters. For the chain, at each step of the
random walk, a set of trial HOD parameters is generated.
Covariances among parameters are taken into account when
proposing the trial move in order to improve the efficiency of
the chain. The probability of keeping the trial HOD
parameters depends on the difference Δχ² = χ²_new - χ²_old between
the old and new (trial) sets of parameters, i.e. 1 for Δχ² ≤ 0
and exp(-Δχ²/2) for Δχ² > 0.
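
The χ² of Eq. 4 and the acceptance rule described above can be sketched in a few lines. The function names and the choice to pass a pre-computed inverse covariance matrix are assumptions for illustration; the proposal step, which uses the parameter covariances, is not shown.

```python
import math
import random

def chi2(model, data, inv_cov, n_g, n_g_obs, sigma_ng):
    """Eq. 4: clustering term plus number-density term.

    inv_cov is the inverse of the (volume-corrected) covariance matrix,
    given as a list of rows.
    """
    d = [m - o for m, o in zip(model, data)]
    quad = sum(d[i] * inv_cov[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
    return quad + ((n_g - n_g_obs) / sigma_ng) ** 2

def accept(chi2_old, chi2_new, rng=random):
    """Keep the trial parameters with probability 1 if chi2 does not
    increase, and with probability exp(-delta_chi2 / 2) otherwise."""
    delta = chi2_new - chi2_old
    return delta <= 0 or rng.random() < math.exp(-0.5 * delta)
```

A chain then repeatedly proposes trial parameters, evaluates `chi2` for the corresponding model prediction, and calls `accept` to decide whether to move.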

2.2 The SHAM Models 

The simplest SHAM model usually assumes a monotonic
relation between the galaxy luminosity (or stellar mass) and
a given halo property (e.g. halo mass), by assigning more
luminous galaxies to more massive haloes. The galaxy
luminosity function is then preserved by matching the number
density of the galaxy sample to that of the haloes (see
e.g. Conroy et al. 2006). Since such an assignment is only
based on the halo property (e.g. halo mass), distinct
haloes and subhaloes in the simulations are not distinguished
from each other. The relation between the galaxies and
the haloes (including both distinct haloes and subhaloes) is
completely determined by the number density distribution
(e.g. luminosity function) of the galaxy sample. Thus, there
is no free parameter in such models. A more flexible SHAM
model is typically introduced to allow a scatter between e.g.
the galaxy luminosity and the halo mass. Such a scatter is
necessary especially when modelling the clustering of
luminous galaxies (see e.g. Reddick et al. 2013).
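
The parameter-free version of abundance matching described above amounts to a rank ordering. The sketch below assumes equal numbers of haloes (distinct haloes and subhaloes together) and galaxies in the same volume; the function name and list-based inputs are hypothetical.

```python
def abundance_match(halo_property, luminosities):
    """Assign luminosities to haloes by rank: the halo with the largest
    property value receives the most luminous galaxy, and so on.

    Both lists must have the same length, i.e. equal number densities
    in the same volume. Returns luminosities in the input halo order.
    """
    order = sorted(range(len(halo_property)),
                   key=lambda i: halo_property[i], reverse=True)
    ranked_lum = sorted(luminosities, reverse=True)
    assigned = [None] * len(halo_property)
    for rank, halo_index in enumerate(order):
        assigned[halo_index] = ranked_lum[rank]
    return assigned
```

Since the mapping is fixed entirely by the two ranked lists, there is no free parameter, in line with the statement above.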

There are a few popular SHAM models that connect 
the galaxy luminosity to the different halo properties. In this 
paper, we only consider the following three SHAM models 
using different halo properties. 

(1) M_acc. For a distinct halo, it is the current halo mass,
while for a subhalo, it is the mass at the last epoch when
the subhalo was a distinct halo (before being accreted on to
another halo).

(2) V_acc. For a distinct halo, it is the current maximum
circular velocity, while for a subhalo, it is the maximum
circular velocity at the last epoch of being a distinct halo
(before being accreted on to another halo).

(3) V_peak. For both distinct haloes and subhaloes, it is
the peak circular velocity over the entire merger history.
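
The three definitions above can be read off from the main-branch history of each object. The sketch below assumes a hypothetical data layout in which the history is stored as time-ordered lists, with a flag marking the steps at which the object was a distinct halo; actual halo catalogues may store these quantities directly.

```python
def sham_properties(mass, vmax, is_distinct):
    """Return (M_acc, V_acc, V_peak) from a halo's main-branch history.

    mass, vmax, is_distinct are lists ordered from early to late times;
    is_distinct[k] is True when the object is not a subhalo at step k.
    At least one step must be flagged as distinct.
    """
    # Last step at which the object was a distinct halo. For a halo that
    # is still distinct today this is the final step, so M_acc and V_acc
    # reduce to the current values.
    last = max(k for k, flag in enumerate(is_distinct) if flag)
    return mass[last], vmax[last], max(vmax)
```

For a subhalo whose circular velocity peaked before accretion, `V_peak` exceeds `V_acc`, as in the first test case of the example history `[100, 160, 150, 120]` with accretion after the third step.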

The properties M_acc and V_acc are commonly used in
the SHAM models because they are closely related to the
halo merger history, while recent results suggest that
choosing V_peak in the model leads to better agreement with the
data (e.g. Moster et al. 2010). The V_peak of a distinct halo
or subhalo is usually significantly larger than V_acc, because
the peak circular velocity is generally achieved earlier in
time than the accretion. The tidal heating and stripping
will later reduce the circular velocity of a subhalo even
before the accretion (see e.g. Fig. 1 of Chaves-Montero et al.
2015). Reddick et al. (2013) compared different SHAM
models and found that V_peak is more closely related to the galaxy
stellar mass, while M_peak (the maximum mass that a halo
or subhalo has ever had in its merger history) is generally
not successful in reproducing the clustering measurements.
So we do not consider the M_peak case in our SHAM
models. We will investigate these three models in the following
sections.

In implementing the SHAM models we allow a scatter
between the galaxy property (here luminosity) and the
adopted halo property. To facilitate the comparison with the
HOD model, the scatter is parametrized by using
the functional form of Eq. 2 to assign galaxies to haloes.
Taking M_acc as an example of the halo property, the
probability of a distinct halo or subhalo having a galaxy in
a given luminosity-threshold sample is

$$P(M_{\rm acc}) = \frac{1}{2}\left[1 + {\rm erf}\left(\frac{\log M_{\rm acc} - \log M_{\rm min,acc}}{\sigma_{\log M_{\rm acc}}}\right)\right]. \qquad (5)$$

The scatter between galaxy property and halo property
is encoded in the parameter σ_logMacc (Zheng et al. 2007),
which is the only free parameter in Eq. 5. The characteristic
mass scale M_min,acc can then be determined by matching
the sample number density. For the other two halo properties,
we only need to replace the mass in Eq. 5 with the
corresponding terms for V_acc and V_peak. Note that the SHAM model
we use here is more flexible than the commonly adopted
one. The usual SHAM model assumes one scatter
parameter and performs the abundance matching for galaxies in the
full range of observed luminosity. Here we model a series of
luminosity-threshold samples, and each has its own scatter
parameter. We are effectively allowing the scatter between
the galaxy luminosity and the halo property to vary with
the halo property.
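
For a given scatter parameter, the characteristic mass scale in Eq. 5 follows from number-density matching. One way to sketch this is a bisection on log M_min,acc over a catalogue of distinct haloes and subhaloes; the function names, the search range, and the tolerance below are illustrative assumptions.

```python
import math

def p_gal(log_m, log_m_min, sigma):
    """Eq. 5: probability that a (sub)halo hosts a sample galaxy."""
    return 0.5 * (1.0 + math.erf((log_m - log_m_min) / sigma))

def solve_log_m_min(log_masses, volume, n_g_target, sigma,
                    lo=8.0, hi=16.0, tol=1e-6):
    """Bisect for log M_min,acc so that the expected galaxy number
    density matches n_g_target at fixed scatter sigma."""
    def n_model(log_m_min):
        return sum(p_gal(x, log_m_min, sigma) for x in log_masses) / volume
    # n_model decreases monotonically as log_m_min increases.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if n_model(mid) > n_g_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Here `log_masses` would hold log M_acc for all distinct haloes and subhaloes in the simulation box of the given `volume`, leaving the scatter as the only quantity to be fitted.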

In the SHAM model we use, a further improvement
is related to the determination of the scatter parameter.
We do not simply assign a scatter parameter for a given
luminosity-threshold sample. The final σ_logMacc used in each
luminosity-threshold sample is determined from the model
with the best-fitting χ² to the galaxy projected 2PCFs. We
emphasize that even though the scatter parameter we
introduce here is formally expressed in terms of the halo
property (mass or circular velocity), it is originally derived from
the scatter in the (lognormal) galaxy luminosity
distribution at a fixed halo mass or circular velocity (see Eq. 4 in
Zheng et al. 2007). The meaning of σ_logMacc is not the
scatter in the halo mass at a fixed galaxy luminosity, but rather
the width of the cutoff profile. We can conveniently convert
σ_logMacc to the scatter in the galaxy luminosity, σ_logL, at
fixed halo mass using the local slope of the L-M_acc relation
at the threshold luminosity, as will be shown in the following
sections.
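
The link between the cutoff width and the luminosity scatter can be checked numerically under two stated assumptions: the scatter in log L at fixed mass is Gaussian, and the mean relation is locally a power law with slope p = dlogL/dlogM at the threshold. Under these assumptions the erf convention of Eq. 5 gives σ_logL = p σ_logMacc / √2. The function names below are hypothetical, and the exact conversion adopted later in the paper may differ in detail.

```python
import math

def sigma_log_l(sigma_log_m, slope):
    """Scatter in log L at fixed mass implied by the cutoff width, for a
    local slope = dlogL/dlogM and Gaussian scatter in log L (assumed)."""
    return slope * sigma_log_m / math.sqrt(2.0)

def p_above_threshold(log_m, log_m_min, slope, sig_l):
    """P(L > L_th | M) for Gaussian scatter sig_l in log L, with the
    mean relation log L_c - log L_th = slope * (log_m - log_m_min)."""
    z = slope * (log_m - log_m_min) / sig_l   # offset in units of sig_l
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Evaluating `p_above_threshold` with `sig_l = sigma_log_l(sigma_log_m, slope)` reproduces the erf profile of Eq. 5 with width `sigma_log_m`, which is the sense in which the parameter describes the cutoff profile rather than a scatter in halo mass.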

For the central galaxy occupation distribution in the M_acc
model, we can directly compare M_min,acc to M_min in the
HOD model, because they both refer to the typical cutoff
mass of the distinct haloes that host the galaxies in the
sample of interest. For satellite galaxies in subhaloes of M_acc
at the time of accretion, with the simulations we can
conveniently convert P(M_acc) in Eq. 5 to the satellite mean
occupation function ⟨N_sat(M_h)⟩ in host haloes of mass M_h.
From the average occupation number ⟨N_sub(M_acc|M_h)⟩ of
subhaloes with mass M_acc in each host halo with mass M_h,
we have

$$\langle N_{\rm sat}(M_h)\rangle = \sum_{M_{\rm acc}} P(M_{\rm acc}) \langle N_{\rm sub}(M_{\rm acc}|M_h)\rangle. \qquad (6)$$

For the cases of the V_acc and V_peak models, the mean satellite
occupation function can be computed similarly by replacing the
mass in Eq. 6 with the corresponding velocity variable.
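
Eq. 6 is a weighted sum over bins of the subhalo property, which the following sketch spells out for one host halo mass. The binning, the function names, and the input layout are assumptions for illustration; the subhalo occupation numbers would come from the simulation catalogue.

```python
import math

def p_gal(log_m_acc, log_m_min, sigma):
    """Eq. 5 occupation probability."""
    return 0.5 * (1.0 + math.erf((log_m_acc - log_m_min) / sigma))

def mean_n_sat(log_m_acc_bins, n_sub_given_host, log_m_min, sigma):
    """Eq. 6: sum over M_acc bins of P(M_acc) * <N_sub(M_acc | M_h)>.

    n_sub_given_host[k] is the mean number of subhaloes in M_acc bin k
    per host halo of the chosen mass M_h, measured from the simulation.
    """
    return sum(p_gal(x, log_m_min, sigma) * n
               for x, n in zip(log_m_acc_bins, n_sub_given_host))
```

Repeating the sum for each host halo mass bin gives a satellite occupation function that can be compared directly with the HOD form of Eq. 3.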

Overall, the SHAM model we use here is more flexible
than the traditional one. We allow the scatter to
depend on the halo property, and determine it by fitting the
projected 2PCF. The number density of the galaxy sample
is ensured to be matched by tuning the characteristic halo
mass scale M_min,acc. In what follows, we further extend or
generalize the SHAM model to make it even more flexible,
with the relevant parameters determined by both the galaxy
abundance and the galaxy clustering (in projected and
redshift spaces).

2.3 A Subhalo Clustering and Abundance 
Matching Model 

The galaxy luminosity (or halo mass/property) dependent
scatter extends the SHAM models. However, as will be
shown below, this extension is still not capable of
satisfactorily interpreting the observed galaxy 2PCFs. We therefore
add further flexibility to the SHAM model and make it a
well parametrized model to fit both the galaxy abundance
and clustering, which we refer to as the subhalo
clustering and abundance matching (SCAM) model.

For a given luminosity-threshold galaxy sample, we
construct the SCAM model by allowing the mass scale M_min,acc
and scatter parameter σ_logMacc in Eq. 5 to be different for
the distinct haloes (central galaxies) and subhaloes
(satellites). That is, we now have probabilities P_cen(M_acc) and
P_sat(M_acc). The extensions for the cases of V_acc and V_peak
are similar. Once a halo property is chosen, we
have four parameters for the central and satellite mean
occupation functions. Such separate parametrizations for
the central and satellite components in the SCAM model
are supported by the recent findings of the differences
between the central and satellite galaxies in the SHAM models
(Rodríguez-Puebla et al. 2012; Watson & Conroy 2013).
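
The effect of the separate central and satellite parametrizations can be sketched by counting expected galaxies in distinct haloes and subhaloes with two independent (mass scale, scatter) pairs. The function names and inputs below are hypothetical; in the actual model these parameters are constrained jointly by the clustering and the number density.

```python
import math

def p_gal(log_m, log_m_min, sigma):
    """Erf-shaped occupation probability (functional form of Eq. 5)."""
    return 0.5 * (1.0 + math.erf((log_m - log_m_min) / sigma))

def scam_counts(log_m_distinct, log_m_sub, cen_params, sat_params):
    """Expected numbers of central and satellite galaxies when distinct
    haloes and subhaloes have separate (log_m_min, sigma) pairs.
    Returns (n_cen, n_sat, satellite fraction)."""
    n_cen = sum(p_gal(x, *cen_params) for x in log_m_distinct)
    n_sat = sum(p_gal(x, *sat_params) for x in log_m_sub)
    return n_cen, n_sat, n_sat / (n_cen + n_sat)
```

Using identical `cen_params` and `sat_params` recovers the SHAM case, while lowering the satellite mass scale relative to the central one raises the satellite fraction at fixed halo population.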

To model the redshift-space 2PCFs with the SCAM
model, the treatment of the central galaxies is the same as in
the HOD model and a central galaxy velocity bias
parameter α_c is introduced. Since the subhaloes are selected to host
satellite galaxies, we also apply a satellite galaxy velocity
bias by scaling the velocity of a subhalo relative to its host
halo with a factor of α_s. So in total we have six free
parameters for the redshift-space modelling with the SCAM model.
As with the HOD model, the parameter space is explored
with the MCMC method with the likelihood determined by
the 2PCFs and the galaxy number density (Eq. 4).


3 PARTICLE AND SUBHALO
DISTRIBUTIONS IN SIMULATIONS

Before we apply the HOD/SHAM/SCAM models to model
the clustering measurements, it is important to understand
the particle and subhalo distributions in the simulations.
While subhaloes are related to satellites in SHAM/SCAM, the
HOD model in this paper connects satellites to dark matter
particles. Any difference seen in the particle and subhalo
distributions will be useful for us to understand the modelling
results.

We show in Fig. 1 the detailed comparisons between
the subhalo distributions in the MDPL and SMDPL
simulations. Panel (a) shows the subhalo mass functions in the two
simulations. The simulation resolution does affect the
identification of the subhaloes in the two simulations. But for
subhaloes of M_acc > 2.8 × 10¹¹ h⁻¹ M_⊙, the subhaloes in MDPL
are about 90% complete, compared to those of the SMDPL.
In terms of circular velocities, subhaloes are 90% complete
in MDPL for V_acc > 176 km s⁻¹ and V_peak > 184 km s⁻¹,
respectively.

As will be shown in the following sections, many faint
satellite galaxies in the SHAM/SCAM model are predicted to
reside in subhaloes of mass Macc around 10^^ h^-1 M⊙. The
corresponding subhaloes identified in the MDPL simulation
suffer from the resolution effect, so for the SHAM/SCAM method
we model the faint galaxy samples of Mr < -18, -18.5, -19, and
-19.5 using the SMDPL simulation and model the more luminous
samples using the MDPL simulation. The volume Vsim of the
SMDPL is much larger than the survey volume Vobs of these
faint samples (G15), so the volume correction (the
1 + Vobs/Vsim factor) to the covariance matrix (Zheng & Guo
2016) is not significant. For the HOD model, since we randomly
select dark matter particles to represent the satellite
galaxies, the resolution of the MDPL simulation is high enough
to model all the luminosity-threshold samples, so we do not
use the SMDPL for the HOD models. We have verified that using
SMDPL to model the faint galaxy samples with the HOD method
produces the same results as using the MDPL simulation. This
is consistent with the fact that the mass functions of the
distinct haloes in MDPL and SMDPL agree down to haloes of
about 5 × 10^° h^-1 M⊙ (Rodríguez-Puebla et al. 2016).

Panels (b) and (c) display the number density profiles of
subhaloes in host haloes around Mh = 10^^ and 10^“^ h^-1 M⊙
for different subhalo properties (Macc, Vacc, and Vpeak, as
labelled). For each subhalo property, the density profiles are
normalized to be the same at the host halo virial radius, and
offsets are added to the curves of different subhalo
properties for clarity. In each set of curves, the black solid
line is the density profile of the dark matter particles. The
solid lines are the subhalo density profiles in MDPL, while
the dotted lines are those in SMDPL. The red and blue curves
are for subhaloes selected using different mass or velocity
thresholds. For the Macc model, the red and blue curves are
for Macc > 10^^ and > 10^^'® h^-1 M⊙, respectively. For the
Vacc (Vpeak) model, the red and blue curves are for Vacc
(Vpeak) larger than 10^'® and 10^ ’^ km s^-1, respectively.
In general, the density profile of the subhaloes is shallower
than that of the dark matter (see e.g. Gao et al. 2004; Pujol
et al. 2014). But as the mass ratio Macc/Mh (or velocity
ratio) increases, the subhalo density profile approaches that
of the dark matter. More importantly, this trend is not
affected by the mass resolution of the simulations, which
indicates that the scarcity of subhaloes in the inner regions
of the host haloes is most likely caused by strong tidal
stripping (see e.g. Springel et al. 2008). Since the stellar
components of satellite galaxies are more tightly bound, they
can still survive to be observed as satellites even if the
corresponding subhaloes lose their identities through tidal
destruction. The possibly different distribution profiles of
subhaloes and satellite galaxies are then an important factor
to consider when interpreting the clustering modelling results
with both the HOD and SHAM/SCAM models.

Panel (d) shows the 3D dark matter velocity dispersions σv as
a function of the host halo mass Mh. The two simulations show
very good agreement with each other. For distinct haloes with
mass Mh > 10^^ h^-1 M⊙, the velocity
Figure 1. Comparisons of the subhalo distributions between the MDPL and SMDPL simulations. In each panel, solid and dotted curves
are from the MDPL and SMDPL simulations, respectively. Panel (a): subhalo mass functions. Panel (b): subhalo spatial distribution
profile in the host haloes of The red and blue curves are for subhaloes selected using different mass or velocity
thresholds. For the Macc model, the red and blue curves are for Macc > 10^^ and > 10^^'® h^-1 M⊙, respectively. For the Vacc and Vpeak
models, the red and blue curves are for Vacc (or Vpeak) larger than 10^’^ and 10^'^ km s^-1, respectively. For each model, the profiles
are normalized to be the same at the host halo virial radius and the curves are separated for different models for clarity. The black
solid lines are the density profiles for the dark matter particles in each case. Panel (c): similar to panel (b), but for the host haloes of
Mh ~ 10 ^^ h^-1 M⊙. Panel (d): 3D dark matter velocity dispersion in distinct haloes of different mass Mh. The shaded area shows the
scatter around the velocity dispersion measurements in SMDPL.


dispersion measurements are not significantly affected by the 
simulation resolutions. 

SHAM models with scatters. The measurements for
error bars. The different SHAM models are shown as
e measurements are shown in the bottom part of each

Figure 3. Normalized covariance matrices for the corresponding
2PCF measurements shown in Fig. 2. From left to right and top to
bottom, the covariance matrices are for the luminosity threshold
samples from Mr < -18 to Mr < -21.5.

Since we have the 3D velocity for each subhalo in the
simulations, an interesting question is the velocity bias of
the subhaloes with respect to the dark matter velocity
distribution. We measure the velocity dispersions σsub for
subhaloes of different masses in different host haloes, and
estimate the average subhalo velocity bias αsub through the
following equation,

\langle\alpha_{\rm sub}\rangle = \left\langle \sigma_{\rm sub}^2/\sigma_v^2 \right\rangle^{1/2},   (7)

which is an unbiased estimate of the subhalo velocity bias
even for a small number of subhaloes in each host halo. The
subhalo velocity dispersion σsub in each halo is calculated by

\sigma_{\rm sub}^2 = \frac{1}{N}\sum_{i=1}^{N} \|\mathbf{v}_{{\rm sub},i} - \mathbf{v}_h\|^2,   (8)

where vsub and vh are the 3D velocities of the subhalo and the
corresponding host halo, respectively, and N is the number of
subhaloes of interest in each halo. Note that our definition
of the subhalo velocity dispersion is different from that of
Wu et al. (2013), who used the mean velocity of all the
subhaloes in the host halo instead of vh in Eq. 8. That is, we
include the dispersion in the offset between the mean velocity
of subhaloes and the halo velocity. Also, the subhalo velocity
bias in Wu et al. (2013) is estimated through ⟨σsub/σv⟩, which
is a biased estimator of the velocity bias and needs
corrections for small N. This can be seen by considering a 1D
velocity distribution with zero mean: while ⟨v²⟩ gives the
dispersion σ², in general ⟨|v|⟩ (a.k.a. the mean absolute
deviation) does not. The reason that we choose vh as the
reference velocity is to match the way we define the satellite
galaxy velocity bias in the HOD model. We measure the subhalo
velocity bias αsub for subhaloes with masses
Macc > 10^^ h^-1 M⊙ in haloes of different Mh in both
simulations. The measured αsub varies from 1.02 to 1.11 for
Macc in the range of 10^^-10^® h^-1 M⊙. The lower mass
subhaloes have slightly larger values of αsub. This trend of
αsub with the subhalo mass is less significant than that in
Fig. 1 of Wu et al. (2013). We find that even for the most
massive subhaloes in their host haloes, the value of αsub is
still around 1, which is much larger than the value of about
0.8 inferred from Wu et al. (2013). (We recover the same
values of αsub as in their Fig. 1 when switching to their
estimator.) Note that the haloes and subhaloes in Wu et al.
(2013) are also identified using the Rockstar code. The above
difference is mainly caused by the biased estimator they use,
with a small contribution from our choice of vh in evaluating
the velocity dispersion.
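
The small-N bias of the two estimators can be illustrated with a toy Monte Carlo. This is only a sketch under stated assumptions: subhalo velocities relative to the host are drawn from an isotropic Gaussian with a known true bias, the per-halo dispersion is the mean squared 3D offset (the form assumed here for Eq. 8), and the function name and default values are hypothetical.

```python
import math
import random


def compare_estimators(n_sub=2, n_halo=100_000, alpha_true=1.0,
                       sigma_v=1.0, seed=1):
    """Return (sqrt of averaged squared ratio, average of the ratio itself)."""
    rng = random.Random(seed)
    # Per-component dispersion such that the 3D dispersion is alpha_true * sigma_v.
    comp_sigma = alpha_true * sigma_v / math.sqrt(3.0)
    sum_sq_ratio = 0.0
    sum_ratio = 0.0
    for _ in range(n_halo):
        sq = 0.0
        for _ in range(n_sub * 3):  # three velocity components per subhalo
            sq += rng.gauss(0.0, comp_sigma) ** 2
        sigma_sub_sq = sq / n_sub  # mean squared 3D offset from the host velocity
        sum_sq_ratio += sigma_sub_sq / sigma_v ** 2
        sum_ratio += math.sqrt(sigma_sub_sq) / sigma_v
    # Average the squared ratio first, then take the square root.
    alpha_from_squares = math.sqrt(sum_sq_ratio / n_halo)
    # Average the ratio of dispersions directly.
    alpha_from_ratio = sum_ratio / n_halo
    return alpha_from_squares, alpha_from_ratio
```

With only two subhaloes per halo, the first value stays close to the input bias, while the second falls several per cent below it, because the square root of a noisy variance estimate is biased low.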

As shown in G15, the satellite galaxy velocity bias αs from
HOD modelling of the redshift-space clustering of our sample
is generally smaller than 1, with a typical value of 0.8.
Therefore, the difference between αs and αsub indicates the
necessity of including satellite velocity bias in the subhalo
models when modelling the redshift-space clustering using
SHAM/SCAM.

Figure 4. Best-fitting χ² of the different SHAM models from
wp-only data for the different luminosity threshold samples. The
number of dof of the models is shown as the horizontal dashed
line.


4 MODELLING THE PROJECTED 2PCFS 

In the following sections, we will consider the modelling of
the projected 2PCF only (wp), as well as the modelling of
both the projected and redshift-space 2PCFs (wp+ξ0,2,4). To
guide the readers, we list all the measurements and models
used in the following sections in Table 1. When only wp is
used in constraining the models, the contribution to χ² from
clustering will only include that from wp in Eq. 4, i.e.
i = wp.

We first consider the modelling of the projected 2PCF
wp(rp) only, which is commonly used in constraining the
HOD and SHAM parameters. In the modelling of wp, we
do not include the velocity bias parameters, because the
projected 2PCF is integrated over the line of sight and hence
relatively insensitive to the galaxy velocities.

4.1 Results from the SHAM Models 

We first compare the modelling results from the three SHAM
models (based on Macc, Vacc, and Vpeak, respectively),
including scatters as described in §2.2. Fig. 2 shows the
best-fitting SHAM models to wp(rp) for the eight
volume-limited luminosity threshold samples in SDSS DR7. The
different SHAM models are shown as lines of different colours.
Overall, the Vpeak model seems to provide the best description
of all the galaxy samples, consistent with the conclusions of
Reddick et al. (2013). The Macc and Vacc models significantly
underestimate the small-scale clustering for faint galaxies
with threshold luminosity Mr fainter than -20.5. This can be
attributed to the shallower subhalo distribution profiles
(Fig. 1). The Vpeak model provides better fits to the data,
because the values of Vpeak for subhaloes are usually much
larger than Vacc. We note that in Fig. 1 the red and blue
curves for Vacc and Vpeak are selected using the same
thresholds. For the same galaxy sample, the thresholds of Vacc
and Vpeak would be different, and the density profiles of the
subhaloes selected using the best-fitting Vpeak model are
closer to the dark matter distribution than those selected
using the best-fitting Vacc model.

However, the goodness of fit to the data cannot be simply
judged by eye, because the full covariance matrices of the
measurements need to be taken into account. Each panel of
Fig. 3 shows the normalized covariance matrix for the
corresponding 2PCF measurements shown in Fig. 2. The
best-fitting χ² for each model is displayed in Fig. 4. For
example, from Fig. 2, it seems that the Vacc model fits
slightly better than the Vpeak model for the Mr < -19.5
sample. But the best-fitting χ² value of the Vpeak model is in
fact smaller, owing to the strong positive correlation between
neighbouring bins of the data measurements. The large
off-diagonal terms of the covariance matrix are important for
all the galaxy samples except for the most luminous one.
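
Why a fit that looks better by eye can have a larger χ² is illustrated by the toy example below. It is a sketch with hypothetical numbers (two bins, unit variances, correlation coefficient 0.9), not the measured covariance matrices; the helper names are illustrative, and a small Gaussian elimination is used only to keep the example free of external dependencies.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def chi2_full(residual, cov):
    """chi^2 = r^T C^{-1} r using the full covariance matrix."""
    y = solve_linear(cov, residual)
    return sum(r * yi for r, yi in zip(residual, y))


def chi2_diag(residual, cov):
    """chi^2 using only the diagonal errors (what the eye tends to judge)."""
    return sum(r * r / cov[i][i] for i, r in enumerate(residual))


# Hypothetical two-bin covariance with strong positive correlation.
COV = [[1.0, 0.9], [0.9, 1.0]]
MODEL_A = [1.0, 1.0]    # both bins offset in the same direction
MODEL_B = [0.8, -0.8]   # smaller offsets, but in opposite directions
```

Here chi2_diag prefers MODEL_B (1.28 against 2.0), whereas chi2_full strongly prefers MODEL_A (about 1.05 against 12.8), since coherent offsets are expected when neighbouring bins are positively correlated.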

As shown in Fig. 2 and Fig. 4, none of the three SHAM models
can provide satisfactory fits for all galaxy samples. The
Vpeak model fits better for galaxy samples fainter than
Mr = -21, while the Vacc model fits better for more luminous
galaxy samples. The overall goodness-of-fit for the Vpeak
model is around χ²/dof ~ 3. Therefore, the three SHAM models
considered above can hardly be regarded as good models of the
observed galaxy projected 2PCFs. We thus consider the more
sophisticated and flexible subhalo models (SCAM) in the
following section.

We show in the left panel of Fig. 5 the comparisons of
the characteristic cutoff circular velocity and the inferred
scatters in galaxy luminosity in haloes with the cutoff
circular velocity in the Vacc and Vpeak models, respectively.
The more luminous galaxy samples have higher cutoff
velocities, and the inferred cutoff for Vpeak is generally
about 0.1 dex higher than that for Vacc.

As discussed in §2.2, the scatter σ_logL in galaxy luminosity
at fixed circular velocity is encoded in the σ_logV parameter
(the width of the cutoff profile in the galaxy occupation
function). Following Zheng et al. (2007) (see details in their
Eq. 4), we have σ_logL = p σ_logV/√2, where p is the local
power-law slope of the L-V relation, i.e. p = dlog L/dlog V.
To obtain the local power-law slope, we make use of the
formula proposed by Vale & Ostriker (2006) to fit the relation
between the sample luminosity threshold L and the velocity
cutoff V (Vacc or Vpeak),

L = L_0 \frac{(V/V_t)^{a}}{\left[1 + (V/V_t)^{bk}\right]^{1/k}}.   (9)

The variables L0, Vt, a, b and k are the model parameters.
As seen from the left panel of Fig. 5, the L-V relation can
also be well described by broken power laws, which justifies
the use of the local power-law slope p in the above equation.
The resulting scatter σ_logL is shown in the right panel of
Fig. 5. Most scatters are smaller than 0.3, and the scatters
in the Vpeak model are generally larger. We note that the
uncertainties on the scatters of the faint galaxy samples are
very large. If the scatters are not taken into account in the
SHAM models, only low-luminosity samples can be reasonably
fitted. The scatters become important for luminous galaxies of
Mr <
Figure 5. Comparisons of the model parameters for the Vacc and Vpeak models from fitting the wp-only data. The left-hand panel shows
the characteristic cutoff circular velocity as a function of sample luminosity threshold for the two models. The right-hand panel shows
the corresponding scatters in galaxy luminosity in haloes with circular velocities around the cutoff velocity (see the text).

Figure 6. Similar to Fig. 2, but for the SCAM models. The best-fitting HOD models are also included, shown as the black lines.


Table 1. Measurements used in the fits with different models

Measurements        Models     Number of Free Parameters  Section  Comments
wp                  SHAM       1                          §4.1     ng exactly matched
wp + ng             SCAM/HOD   4                          §4.2
wp + ξ0,2,4 + ng    SCAM/HOD   6                          §5       SHAM results also presented


Figure 7. Left: best-fitting χ² of different models from fitting wp-only data for the different luminosity threshold samples. The
number of dof of the models is shown as the horizontal dashed line. Right: comparison between the galaxy number densities (curves)
from the best-fitting models and the measured ones (circles).

Figure 8. Mean halo occupation functions of the best-fitting HOD and SCAM models from fitting the wp-only data for different
luminosity threshold samples.


-20.5. Overall, the scatter we infer is consistent with that in
the Tully-Fisher relation.

4.2 Results from the SCAM and HOD Models

The large χ²/dof values of the SHAM models are mostly caused
by the underestimates of the small-scale clustering. Since the
small-scale galaxy pairs are dominated by the one-halo term,
i.e. intra-halo galaxy pairs, the above underestimate could be
an indication that subhaloes are not complete in representing
satellite galaxies towards the centre of host haloes. Compared
to the stellar components of satellite galaxies, subhaloes in
N-body simulations are more easily disrupted, especially in
the central regions of the host haloes where the tidal
stripping effect is more significant. Indeed, differences in
the distribution profiles between subhaloes and satellite
galaxies have been seen in N-body and hydrodynamic simulations
with the same initial conditions (e.g. Fig. 7 of Weinberg et
al. 2008 and Fig. 2 of Vogelsberger et al. 2014b).

Figure 9. Comparisons of the model parameters of the four models from fitting the wp-only data for the different luminosity threshold
samples. The left-hand panel shows the comparisons of the characteristic cutoff mass Mmin of host haloes and the characteristic mass
M1 of haloes hosting on average one satellite galaxy. The satellite fraction fsat is shown in the right-hand panel.

However, if we work under the implicit assumption adopted in
most SHAM models that satellites can only reside in subhaloes
identified in N-body simulations, there is another way to
improve the fit to the small-scale clustering, namely by
adding components to the SHAM models. If we allow the central
and satellite galaxies to have different occupation
distributions in the distinct haloes and subhaloes, as in our
SCAM models, the deficiency of small-scale galaxy pairs can be
compensated by more satellite galaxies populating subhaloes in
lower mass host haloes. The galaxy number density can still be
preserved by increasing the cutoff mass (or velocity) scale of
the central galaxies. This seems like an extreme model that
possibly artificially increases the fraction of satellite
galaxies, as we allow the relation between central galaxies
and distinct haloes and that between satellites and subhaloes
to be completely independent of each other in SCAM, which may
not be true in reality. On the other hand, there is some
evidence that the connection between central galaxies and
distinct haloes and that between satellites and subhaloes
should be different (Yang et al. 2009, 2012; Neistein et al.
2011b; Rodríguez-Puebla et al. 2012; Wetzel et al. 2012;
Watson & Conroy 2013). Within the SHAM framework, results from
our SCAM model, which jointly fits the 2PCFs and the galaxy
number density, may serve as a probe of the difference between
central and satellite galaxies.

The best-fitting HOD and SCAM models to the projected 2PCF wp
are shown as the solid lines in Fig. 6. The χ² values of the
model fits are displayed in the left panel of Fig. 7. All
three SCAM models have much better best-fitting χ² than the
SHAM models, with only three more free parameters. Judged from
the best-fitting χ² values, the HOD model and the Vacc model
are the two best models. For galaxy samples fainter than
Mr = -20, the values of χ²/dof of the two models are both
around unity. For more luminous galaxies, the HOD model has
χ²/dof ~ 1.8. Note that in the HOD model, we set a prior by
fixing the high-mass end slope α of the satellite mean
occupation function to be unity, for the purpose of reducing
the number of parameters to be the same as in the SCAM models.
If we also allow α to vary, the best-fitting value of α for
these luminous galaxies is about 1.15 and the χ²/dof would be
significantly reduced to values around unity for the HOD
model, as shown in table 2 of G15. Compared to α = 1, the
higher-than-unity value of α implies that luminous satellite
galaxies tend to populate even more massive haloes. We also
note that owing to the strong correlation in the off-diagonal
elements of the covariance matrices, the χ² cannot be simply
judged from the ratios between the models and data, as
explained in the previous sections. For example, for the faint
galaxy sample of Mr < -19, the HOD and Vacc models have almost
the same χ². However, the model predictions for wp are quite
different.

Except for the Vpeak model, which has a strong variation of χ²
with the sample luminosity, the other three models can fit the
faint galaxy samples very well. That is, once we allow the
central and satellite galaxies to have different relations to
the host haloes and the subhaloes, the satellite occupation
can be adjusted to reproduce the small-scale clustering. For
the most luminous galaxy sample of Mr < -21.5, all four models
have similar best-fitting χ² values. As will be shown in the
following, the ratio between the typical subhalo mass and the
host halo mass increases with the galaxy luminosity (see e.g.
Guo et al. 2014). According to Fig. 1, this makes the spatial
distribution of subhaloes in the host haloes approach that of
the dark matter, which explains why the SCAM models produce
best-fitting χ² values more consistent with the HOD model.
The right panel of Fig. 7 shows the best-fitting galaxy
number density for the different models. The Vpeak model has
slightly lower galaxy number densities for the two samples of
Mr < -19 and Mr < -20, which is mainly responsible for the
larger χ² shown in the left panel. The other three models
reproduce the observed galaxy number densities remarkably
well. We note that, unlike in the SHAM models, the number
densities in the SCAM models are not required to exactly match
those of the galaxy samples, and the discrepancies in the
number densities contribute to the total χ². The models tend
to find a balance between fitting the 2PCFs and fitting the
sample number densities. However, the contribution of the
number density to the total χ² is usually small, since a
reasonable model that describes the 2PCFs well also predicts a
reasonable sample number density. Even for the case with the
largest deviation seen in the right panel of Fig. 7 (the Vpeak
model for the sample of Mr < -19), its contribution to the
total χ² is only 3.7%.
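
The bookkeeping behind such a fractional contribution can be sketched as below. This assumes that the number-density term enters the total χ² as a single Gaussian term added to the clustering term, which is an assumption about the form of Eq. 4; the function name and the numbers in the usage note are hypothetical and are not taken from the fits.

```python
def chi2_with_number_density(chi2_clustering, n_model, n_obs, sigma_n):
    """Return (total chi^2, fraction contributed by the number-density term).

    Assumed form: chi^2_total = chi^2_clustering + ((n_model - n_obs) / sigma_n)^2.
    """
    chi2_n = ((n_model - n_obs) / sigma_n) ** 2
    total = chi2_clustering + chi2_n
    return total, chi2_n / total
```

For instance, chi2_with_number_density(26.0, 0.95, 1.0, 0.05) describes a model whose number density is one standard deviation low; the number-density term then adds about 1 to a clustering χ² of 26, a contribution of a few per cent to the total.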

Fig. 8 shows the mean occupation functions of the
best-fitting HOD and SCAM models. Sharp cutoff profiles are
seen for the faint galaxy samples. But we should note that the
scatters between the galaxy luminosity and the halo properties
are not well constrained in any of the models for faint
galaxies (see also G15). The cutoff profiles in the Vacc and
Vpeak models are softened because of the scatter between the
circular velocity and the halo mass (see also Fig. 5 of Conroy
et al. 2006). The trends in the mean occupation function with
galaxy luminosity are similar in the different models. For the
Mr < -21.5 sample, the mean occupation functions from the four
models are closely matched, while the differences become
larger for fainter galaxies.

Fig. 9 presents the detailed comparisons of the three HOD
parameters: the characteristic host halo mass Mmin, the
characteristic mass M1 of haloes hosting on average one
satellite galaxy, and the satellite fraction fsat. For a fair
comparison, we convert the corresponding model parameters in
the SCAM models to those of the HOD model using Eq. 6 and the
corresponding versions for Vacc and Vpeak. Except for the
Vpeak model, the other three models have consistent
constraints on the host halo mass scale Mmin, because Mmin is
mostly constrained by the sample number density and the
large-scale galaxy bias.

As seen in Fig. 1, the subhalo distribution profile in the
host haloes is generally shallower than that of the dark
matter. The small-scale clustering is sensitive to the
satellite occupation distribution, since it is dominated by
the one-halo term, i.e. the galaxy pairs within the same host
halo. In order to compensate for the shallower profile and to
match the small-scale clustering measurements of wp, the SCAM
models tend to populate satellite galaxies into lower mass
haloes than in the HOD model. In the SCAM models, this is
realized by lowering the mass (velocity) scale and increasing
the scatter for populating subhaloes, compared to the way of
populating distinct haloes. As a consequence, the
characteristic mass M1 (left panel of Fig. 9) inferred from
the SCAM models is generally smaller and the satellite
fraction fsat (right panel of Fig. 9) is higher than that from
the HOD model. The Vacc SCAM model shows the best overall
agreement with the HOD model, with more or less consistent
best-fitting χ² values (Fig. 7). The HOD-related parameters of
the four models are in better agreement for luminous galaxies.
However, the χ² values are still quite different from model to
model (Fig. 7), indicating the effect and importance of the
spatial distribution of satellites (subhaloes or particles in
the four models) in modelling small-scale wp. For example, the
model parameters of the three subhalo models for the
Mr < -20.5 sample are consistent with each other, but the Macc
model still has a χ²/dof value as large as 4.2. Based on the
best-fitting χ² values, the subhaloes selected by circular
velocities (Vacc or Vpeak) seem to better trace the satellite
galaxies (see also e.g. Chaves-Montero et al. 2015).


5 MODELLING THE REDSHIFT-SPACE 2PCFS

As shown in G15, jointly fitting the projected and
redshift-space 2PCFs helps tighten the constraints on the
galaxy spatial distribution in the haloes, as well as
constraining their velocity distributions. Since the
traditional SHAM models do not include the galaxy velocity
bias that is required to fit the redshift-space 2PCFs, the
resulting χ²/dof values are found to be very large. We show in
Fig. 10 the predicted redshift-space monopole and quadrupole
moments in the SHAM models that best fit wp. Clearly, the
traditional SHAM models fail to describe the redshift-space
clustering, especially the quadrupoles. Therefore, in this
section, we only compare the HOD and SCAM model fitting
results. We first display in Fig. 11 the predictions of the
projected 2PCF wp(rp) for the best-fitting HOD and SCAM models
from jointly fitting both the projected and redshift-space
2PCFs. It is similar to Fig. 6, except that the Macc model
leads to poorer fits for the faint galaxy samples, as a result
of tuning parameters to fit the redshift-space clustering.
Fig. 12 shows the best fits to the redshift-space 2PCFs. For
clarity, we only show the best-fitting models to the measured
redshift-space monopole (circles) and quadrupole (squares)
moments. The hexadecapole moments are also used in the model
fits, but are not shown in the figure. The χ² values of the
best-fitting models are shown in the left panel of Fig. 13,
while the right panel displays the best-fitting sample number
densities.

Except for the Macc model, the other three models fit the data
reasonably well. As seen from Fig. 12, the largest deviation
of the Macc model fits from the measurements and from the fits
of the other models lies in the quadrupole, which dominates
the contributions to the χ². Moreover, the best-fitting sample
number densities from the Macc model are significantly lower
than the observed ones for the faint galaxy samples (except
for the Mr < -18 sample). Compared to the constraints from
fitting wp only (Fig. 7), the Macc model has its galaxy number
density decreased in the joint fit in order to match the
redshift-space clustering. Since the Macc model provides very
good fits to wp for the faint galaxies, the failure in
matching the galaxy redshift-space clustering measurements
indicates that the subhaloes selected based on Macc cannot
reproduce well the velocity distribution of the observed
satellite galaxies.

Figure 10. Similar to Fig. 2, but for the redshift-space monopole (circles) and quadrupole (squares) moments predicted by the SHAM
models that best fit wp only. The measured and modelled monopole moments are shifted upwards by 30 for clarity.

Except for the sample of Mr < -20.5, the HOD model can explain
the observed galaxy 2PCFs very well, with a reasonable χ²/dof
for each sample. As mentioned in the previous section, the
model fit to the luminous galaxy samples (including
Mr < -20.5) can be significantly improved when we allow the
high-mass end slope α of the mean occupation function to vary
(see e.g. Table 2 of Guo et al. 2015c). Among the three SCAM
models, the Vacc model fits the data better than the other two
subhalo models, similar to the case of fitting wp only. The
dof of the models is 43 (48 2PCF data points plus one number
density and minus six free parameters), and the 2σ range of
the expected χ² distribution is about 43±18.5. Even though the
χ² values from the HOD model are overall lower than those from
the subhalo models, those from the Vacc and Vpeak models are
still within the 2σ range, giving reasonable fits to the data.
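
The quoted numbers follow from the standard properties of a χ² distribution, whose mean equals the dof and whose variance equals twice the dof. The short sketch below reproduces the arithmetic; the function name is illustrative.

```python
import math


def chi2_two_sigma_range(n_data, n_params):
    """Return (dof, half-width of the 2-sigma range) for a chi^2 distribution."""
    dof = n_data - n_params
    sigma = math.sqrt(2.0 * dof)  # standard deviation of chi^2 with `dof` degrees of freedom
    return dof, 2.0 * sigma


# 48 2PCF data points plus one number density, minus six free parameters.
DOF, HALF_WIDTH = chi2_two_sigma_range(48 + 1, 6)
```

This gives DOF = 43 and a half-width of 2√86, which is approximately 18.5, consistent with the range quoted in the text.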

Fig. 14 shows comparisons of the parameters M1, Mmin and fsat,
as in Fig. 9. Similar to the results from fitting wp only,
differences in Mmin and M1 between the models become larger
for fainter galaxy samples. If we focus on comparing the HOD
model with the Vacc and Vpeak subhalo models (which provide
reasonable fits to the data), we find that the HOD model has
the smallest Mmin, the highest M1, and the lowest satellite
fraction. The SCAM models tend to populate satellite galaxies
into lower mass haloes to compensate for their shallower
spatial distribution in the host haloes. Compared to the right
panel of Fig. 9, the uncertainties in fsat are greatly
reduced, because the redshift-space clustering puts more
constraints on the satellite galaxy distributions.

We show in Fig. 15 the model constraints on the galaxy
velocity bias parameters for the different luminosity
threshold samples. The black, green, blue and red curves are
for the HOD, Macc, Vpeak and Vacc models, respectively. The
solid and dashed lines are for the central (αc) and satellite
(αs) galaxy velocity bias parameters, respectively. The model
constraints on the central galaxy velocity bias are generally
consistent with each other. The best-fitting values are much
smaller than those in G15. The difference is caused by the
different reference used to define the velocity bias. In this
paper, the reference halo velocity is defined as the average
velocity of particles within the inner 10% of the halo radius
(the core), while the velocity bias in G15 is defined with
respect to the halo bulk velocity. There is a relative motion
between the core and bulk of a halo (Behroozi et al. 2013). An
average central galaxy velocity bias αc ~ 0.1 is required to
fit the redshift-space 2PCFs.

For the satellite velocity bias αs, the results from the
HOD and the SCAM models cannot be directly compared.
The satellite velocity bias αs for the HOD model is defined
with respect to the dark matter velocity dispersion within
the haloes, i.e. αs,HOD = σsat/σv, while the satellite velocity
bias in the SCAM models is defined with respect to the velocity
dispersion of the subhaloes in the host haloes, i.e. αs,SCAM =
σsat/σsub = (σsat/σv)/(σsub/σv) = αs,HOD/αsub. The subhalo
velocity bias αsub is measured to vary from 1.02 to
1.11 in §3. We take an intermediate value of 1.07 for αsub, so
that we can directly compare αs,HOD and αsub × αs,SCAM. The value
of αs,HOD is around 0.8 for faint galaxies, and increases with
luminosity for the two most luminous galaxy samples, consistent
with the results of G15. But αs,HOD is always smaller
than αs,SCAM (hence even smaller than αsub × αs,SCAM)
inferred from the three SCAM models. There are also significant
differences in αs,SCAM among the three SCAM models,
with the Macc model having the smallest αs,SCAM and
the Vpeak model having the largest. The results show
that models with a shallower satellite spatial distribution
need to compensate by having more satellites in lower mass
haloes and a larger boost in velocity dispersion to match the
redshift-space distortion, consistent with the test shown in
Fig. 11 of Guo et al. (2015a).
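
For reference, the conversion between the two satellite velocity bias conventions can be written out explicitly. This only restates the definitions given above, with σv denoting the dark matter velocity dispersion inside the halo; the numerical example at the end is purely illustrative and not a fitted value.

```latex
\begin{align}
\alpha_{s,\mathrm{HOD}} &= \frac{\sigma_{\mathrm{sat}}}{\sigma_v}, \qquad
\alpha_{\mathrm{sub}} = \frac{\sigma_{\mathrm{sub}}}{\sigma_v}, \\
\alpha_{s,\mathrm{SCAM}} &= \frac{\sigma_{\mathrm{sat}}}{\sigma_{\mathrm{sub}}}
  = \frac{\sigma_{\mathrm{sat}}/\sigma_v}{\sigma_{\mathrm{sub}}/\sigma_v}
  = \frac{\alpha_{s,\mathrm{HOD}}}{\alpha_{\mathrm{sub}}}, \\
\alpha_{\mathrm{sub}}\,\alpha_{s,\mathrm{SCAM}} &\simeq 1.07\,\alpha_{s,\mathrm{SCAM}}
  \quad \text{(to be compared with } \alpha_{s,\mathrm{HOD}}\text{)}.
\end{align}
```

For example, a hypothetical αs,SCAM = 1.00 corresponds to 1.07 in the HOD convention. Adopting the extremes of the measured range of the subhalo velocity bias (1.02 or 1.11) instead of 1.07 would change this converted value by at most about 5 per cent.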

Figure 11. Similar to Fig. 6, but for the best-fitting HOD and SCAM models from fitting both the projected and redshift-space 2PCFs.

Satellites in the HOD model have the steepest spatial
distribution profile. Subhaloes in the Macc model have
a steeper density profile than those in the other two subhalo
models. We show in Fig. 16 three examples of the projected
satellite galaxy number density profiles Σsat(rp) as a function
of the projected distance rp to the centres of the hosting haloes
(see e.g. Chen et al. 2006; Wang et al. 2014). The projected
number density is integrated over the same line-of-sight distance
as in the calculation of wp(rp), i.e. 40 h⁻¹ Mpc. The
turnover points in each sample roughly show the scale of
the virial radii of the hosting haloes in these samples. The
trend of the satellite density profiles is consistent with the
behaviour of the satellite velocity bias αs in Fig. 15. Although
the Macc model generally has a slope of the satellite galaxy
density profile closer to that of the dark matter distribution,
it does not necessarily lead to better fits to the galaxy 2PCF
measurements. The subhalo models differ not only in the
resulting subhalo density profiles, but also in
the hosting halo masses (left panel of Fig. 14). The
difference in the satellite density profiles is partly compensated
by the different satellite fraction fsat in each model.
The Vpeak model has the highest fsat in each galaxy sample
(right panel of Fig. 14) to compensate for its shallowest
satellite distribution profiles.
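
A projected satellite profile of this kind can be measured from a mock catalogue along the following lines. This is a minimal sketch under stated assumptions: the third coordinate is taken as the line of sight, periodic boundaries are ignored, and the normalization (counts per annulus area per host with at least one satellite) as well as all names are illustrative choices rather than the procedure behind Fig. 16.

```python
import numpy as np

def projected_satellite_profile(sat_pos, host_pos, rp_edges, pi_max=40.0):
    """Mean projected satellite surface density per host, Sigma_sat(r_p).

    sat_pos  : (N, 3) satellite positions; the third axis is the line of sight
    host_pos : (N, 3) centre of the host halo of each satellite
    rp_edges : increasing bin edges in projected separation r_p
    pi_max   : maximum line-of-sight separation kept (same length units)
    """
    sat_pos = np.asarray(sat_pos, dtype=float)
    host_pos = np.asarray(host_pos, dtype=float)
    rp_edges = np.asarray(rp_edges, dtype=float)

    d = sat_pos - host_pos
    rp = np.hypot(d[:, 0], d[:, 1])        # projected separation
    keep = np.abs(d[:, 2]) <= pi_max       # line-of-sight integration range
    counts, _ = np.histogram(rp[keep], bins=rp_edges)

    area = np.pi * (rp_edges[1:] ** 2 - rp_edges[:-1] ** 2)  # annulus areas
    n_host = len(np.unique(host_pos, axis=0))  # hosts with at least one satellite
    return counts / (area * n_host)
```

With logarithmically spaced `rp_edges`, a steeper satellite distribution shows up as a faster decline of the returned values towards large rp.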

Since in our subhalo models we allow the central and
satellite galaxies to have different relations with the hosting
haloes (subhaloes), we can compare the model parameters
for the central and satellite galaxies. Since the Macc model
does not provide a good fit for each galaxy sample,
we focus on the comparisons between the Vacc and Vpeak
models. The left panel of Fig. 17 shows the comparisons of
the circular velocity thresholds Vmin,cen and Vmin,sat for the
Vacc (open circles with solid line) and Vpeak (filled circles
with dashed line) models. It is clear that the assumption
of the same galaxy-halo relation for central and satellite
galaxies does not hold for the Vpeak model, where Vmin,sat
is generally much larger than Vmin,cen. However, the Vacc
model has almost the same circular velocity thresholds for
central and satellite galaxies. The relation Vmin,cen = Vmin,sat
holds within errors for all the luminosity threshold samples.

The right panel of Fig. 17 shows the scatter parameter
σlogVacc in the Vacc model for distinct haloes and subhaloes
(corresponding to central and satellite galaxies). In general,
the scatters for the central and satellite galaxies are not
equal to each other, with the satellite galaxies having larger
scatters between the luminosity and Vacc. For the three luminosity
threshold samples around L*, i.e. Mr < -20, -20.5
and -21, central and satellite galaxies have similar Vmin,acc
and σlogVacc. This implies that the SHAM model with scatter
works well for these samples, which is consistent with the
low χ² values in the Vacc model for the wp-only data in Fig. 4.
But for other samples, central and satellite galaxies have
different scatters in the luminosity-velocity relation, with
satellites having larger scatters, which may result from
the different evolution histories of the central
and satellite galaxies.

Figure 12. Similar to Fig. 10, but for the HOD and SCAM models. The best-fitting models come from jointly fitting the projected 2PCF
wp and the redshift-space 2PCF multipole moments ξ0/2/4. The measurements of the monopole moments are shifted upward by 10 for clarity.

Figure 13. Similar to Fig. 7, but for models jointly fitting the projected and redshift-space 2PCFs.

We note that the Vpeak model generally has a higher
Vmin,sat than Vmin,cen, compared to the Vacc model. However,
the Vpeak model has a higher satellite fraction fsat (right
panel of Fig. 14), owing to a much larger satellite
luminosity-velocity scatter (σlogV) than in the Vacc model.

As a whole, when modelling redshift-space 2PCFs, we
find that both the HOD and SCAM models can give reasonable
fits to the measurements for luminous galaxy samples
(above L*). For low-luminosity galaxy samples (below L*),
the HOD model, which uses dark matter particles to represent
satellite galaxies, leads to the lowest χ² among all
the models. Among the subhalo models, if the best-fitting
χ² values of the low-luminosity samples are compared, the Vacc
model has the best performance. The Vpeak model is somewhat
worse, and the Macc model fails to fit the data
(except for the Mr < -18 sample). The results imply that
the circular velocities Vacc and Vpeak are more strongly
correlated with satellite luminosity than Macc.

Figure 14. Similar to Fig. 9, but for the models jointly fitting the projected and redshift-space 2PCFs.



Figure 15. Galaxy velocity bias probability distributions for different models, constrained from jointly fitting the projected and
redshift-space 2PCFs. The solid and dashed lines are for the central (αc) and satellite (αs) galaxy velocity bias, respectively. Different panels
show the distributions for different luminosity threshold samples. The black, green, blue and red curves are for the HOD, Macc, Vpeak
and Vacc models, respectively.


6 CONCLUSIONS AND DISCUSSIONS 

In this paper, we employ the HOD model and different
SHAM models (and their extension, the SCAM models)
to model the projected and redshift-space 2PCF measurements
for the different luminosity threshold samples in
the SDSS DR7 Main galaxy sample. All the models are
based on the high-resolution MDPL/SMDPL N-body simulations,
using the accurate and efficient method developed in
Zheng & Guo (2016). We explicitly compare the best-fitting
χ² values and the modelling results of the HOD model, the
SHAM models, and the SCAM models. The HOD model
uses dark matter particles in host haloes to represent satellite
galaxies, while the three sets of SHAM/SCAM models
use the halo properties Macc, Vacc, and Vpeak, respectively,
to establish the connection between haloes and galaxies.

In the SHAM model, distinct haloes and subhaloes are
treated in the same way when connected to galaxies. Even
with the projected 2PCF wp data alone, the SHAM model,
no matter which halo property is used, generally fails to
provide satisfactory fits to all the luminosity threshold
samples, with a typical χ²/dof > 2. We therefore introduce
the SCAM model by allowing the relation between central
galaxies and distinct haloes and that between satellite galaxies
and subhaloes to be different, and determine the model
parameters by jointly fitting the observed 2PCFs and the
sample number density. The SCAM models give significantly
better χ² values than the SHAM models.

Figure 16. Projected number density profiles of satellite galaxies
from the four different best-fitting models. Offsets are added to
separate the cases of different luminosity threshold samples for
clarity.

For an easy comparison, we choose parametrizations so
that the HOD and SCAM models have the same dof. The
main difference between the two models lies in the spatial
distribution profile of satellites inside distinct haloes.
Subhaloes (satellite tracers in the SCAM models) generally have
a shallower spatial distribution profile than dark matter
particles (satellite tracers assumed in our HOD model). The
shallow distribution profile of subhaloes in N-body simulations
may be partially an effect of ignoring the baryon
components: satellites traced by the more tightly bound
stellar component suffer less from the tidal disruption that
destroys a fraction of subhaloes near the halo centre. This is
supported by the comparisons of the distributions of subhaloes
and satellite galaxies in hydrodynamic and N-body simulations
(e.g. Weinberg et al. 2008; Vogelsberger et al. 2014a),
and additional investigations along this direction can shed
further light on this phenomenon. In this paper, we work
under the SHAM assumption that satellites are traced by
subhaloes, investigate to what extent the subhalo models
can interpret the data, and study the corresponding
implications.

As expected, the differences in the modelling results
between the HOD and SCAM models and among the different
SCAM models can be largely traced back to the differences
in the spatial distribution profile of satellites. Compared to
the HOD modelling results, the SCAM models tend to populate
more satellites into lower mass host haloes to compensate
for the shallower subhalo distribution profile and hence
to fit the small-scale clustering measurements. This leads
to a higher satellite fraction in the SCAM models. When fitting
the redshift-space 2PCFs, we include the central and
satellite galaxy velocity biases in all the models. The derived
nonzero central galaxy velocity bias constraints of the
SCAM models are consistent with those of the HOD model. The
satellite galaxy velocity bias is higher in the SCAM models.
The reason is as follows. As mentioned above, to match the
small-scale (real-space) clustering, more satellites are
populated into lower mass haloes in the SCAM models, and
in these host haloes satellites move more slowly than in the
HOD model. The SCAM models therefore need to boost the
velocities of satellites inside host haloes to fit the
redshift-space distortion in the data, especially the
Finger-of-God part.

From jointly modelling the projected and redshift-space
2PCFs, we find that the HOD model has an overall good
performance. For luminous samples (above L*), all SCAM
models provide good fits to the data, and the Vpeak and Vacc
models even work better than the HOD model in terms of
χ² (Fig. 13). However, for galaxy samples with threshold
luminosity below L*, the models become divided. The HOD
model performs best, with the lowest χ² values. The Macc model
fails to fit the data (except for the sample with the lowest
luminosity threshold, Mr < -18). The Vacc and Vpeak models
lead to χ² values higher than those from the HOD model,
with the Vacc model being better. The χ² values from the
two models are within the 2σ range of the expected value.
The results suggest that circular velocities (Vacc and Vpeak)
are better quantities than the mass Macc to connect to the
luminosity of galaxies, especially satellites, even though
Macc-selected subhaloes have the steepest spatial profile among
the SCAM models. We therefore recommend that the SHAM
model should no longer use Macc to link to galaxy luminosity.
This is in line with the recent finding by Contreras et al.
(2015), who investigate the SHAM performance for galaxies
in two different galaxy formation models and find that
subhalo mass is not a good indicator of galaxy properties.
For the two circular velocity SCAM models, the Vacc model
is slightly better than the Vpeak model in reproducing the
projected and redshift-space 2PCFs. In either model, different
galaxy-halo relations for central and satellite galaxies
(distinct haloes and subhaloes) are overall required by the
data.
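
As a rough guide to what "within the 2σ range of the expected value" means, the following sketch uses the standard properties of the χ² distribution (mean equal to dof, variance equal to 2 dof). Reading the criterion this way is an assumption about the statement above, and the numbers in the example are hypothetical, not values from this paper.

```python
import math

def chi2_within_n_sigma(chi2, dof, n_sigma=2.0):
    """True if chi2 lies within n_sigma of its expected value.

    Uses the mean (dof) and standard deviation (sqrt(2 * dof)) of the
    chi-square distribution; the tails are treated as symmetric, which
    is only an approximation for small dof.
    """
    return abs(chi2 - dof) <= n_sigma * math.sqrt(2.0 * dof)

# Hypothetical numbers, for illustration only (not values from this paper)
print(chi2_within_n_sigma(60.0, 50))  # True:  |60 - 50| = 10 <= 2 * sqrt(100) = 20
print(chi2_within_n_sigma(75.0, 50))  # False: |75 - 50| = 25 >  20
```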

The comparisons between the best-fitting χ² for the
HOD and SCAM models show that the HOD model is generally
the best model to describe the galaxy distribution in
both projected and redshift spaces. However, the Vacc and
Vpeak models are still acceptable, especially for modelling
luminous galaxy samples. Including other clustering statistics
(e.g. the three-point correlation functions; Guo et al. 2015b)
may help to further distinguish these models, as well as to
tighten parameter constraints.

Figure 17. Comparisons of the subhalo model parameters for the central and satellite galaxies from jointly fitting the projected and
redshift-space 2PCFs. The left panel shows the comparisons of the circular velocity thresholds Vmin,cen and Vmin,sat for the Vacc (open
circles with solid line) and Vpeak (filled circles with dashed line) models. The right panel shows the comparisons of the scatters σlogVacc
for central and satellite galaxies, for the Vacc model only. See text for details.

It is worth noting that we adopt specific functional
forms (Equations 2 and 5) to describe the occupation functions
of central and satellite galaxies in the haloes for all
the models considered in this paper. Such a functional form
is motivated by the results of semi-analytic models and
hydrodynamic simulations of galaxy formation (Zheng et al.
2005). It can be derived by assuming a lognormal distribution
of the central galaxy luminosity at fixed halo mass and
a power-law relation between the mean luminosity of central
galaxies and the host halo mass (Zheng et al. 2007). In
the halo mass range where the luminosity-halo mass relation
(LHMR) or SHMR deviates significantly from a power law,
the functional form is less accurate and the interpretation
of parameters like Mmin becomes subtle. Leauthaud et al.
(2011) compared the best-fitting HOD parameter Mmin
(defined by ⟨Ncen(Mmin)⟩ = 0.5) obtained with the
SHMR of Behroozi et al. (2010) and that obtained with a
power-law SHMR, and found that the difference in Mmin is < 20% for
models with Mmin in the range of IO^^-IO^^Mq. For the rel¬ 
evant samples we model, the changes in log Mmin are 0.08, 
0.04, and -0.04 dex for Mr < -20.5, -21, and -21.5, respectively,
all within the 1σ model uncertainties.
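
The numbers above can be put in context with a small sketch. It assumes that the central occupation function takes the widely used error-function form, ⟨Ncen(M)⟩ = 0.5 [1 + erf((log M − log Mmin)/σlogM)], which is consistent with the definition ⟨Ncen(Mmin)⟩ = 0.5 but is not quoted in this section; the function names and the parameter values are illustrative.

```python
import math

def mean_ncen(log_m, log_m_min, sigma_log_m):
    """Assumed mean central occupation: 0.5 * [1 + erf((logM - logMmin) / sigma_logM)]."""
    return 0.5 * (1.0 + math.erf((log_m - log_m_min) / sigma_log_m))

def fractional_to_dex(frac):
    """Shift in log10(M) corresponding to a fractional change in M."""
    return math.log10(1.0 + frac)

print(mean_ncen(12.0, 12.0, 0.3))          # 0.5 at M = Mmin, for any width
print(round(fractional_to_dex(0.20), 3))   # 0.079
```

A 20% difference in Mmin therefore corresponds to about 0.08 dex, the same size as the largest of the shifts in log Mmin quoted above.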

To derive the functional form of Equation 2, the scatter
in central galaxy luminosity needs to be independent of halo
mass, and σlogM is connected to the luminosity scatter and
the form of the LHMR. In general, σlogM should not be
interpreted as the scatter of halo mass at fixed galaxy luminosity
(Zheng et al. 2007; Leauthaud et al. 2011). Instead, it
describes the width of the cutoff profile of the central galaxy
mean occupation function, as noted in Section 2.2. In modelling
the data, the role of the cutoff profile is to convolve
with the halo mass function and halo bias factor to reproduce
the galaxy number density and the large-scale galaxy
bias, and the two quantities are not sensitive to the functional
form of the cutoff profile (as long as the freedom
in width and mass scale is included). Therefore, while the
interpretation of parameters like σlogM can be subtle,
the modelling results would not be affected much by the
functional form.
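
The two quantities referred to above can be written in the standard analytic halo-model form, shown here as a sketch. The symbols dn/dM (halo mass function) and b_h(M) (halo bias factor) are the usual ones and are introduced here only for illustration; the simulation-based method used in this paper may evaluate these quantities differently.

```latex
\begin{align}
n_g &= \int \frac{\mathrm{d}n}{\mathrm{d}M}\,\langle N(M)\rangle\,\mathrm{d}M, \\
b_g &= \frac{1}{n_g}\int \frac{\mathrm{d}n}{\mathrm{d}M}\,b_h(M)\,\langle N(M)\rangle\,\mathrm{d}M,
\qquad
\langle N(M)\rangle = \langle N_{\mathrm{cen}}(M)\rangle + \langle N_{\mathrm{sat}}(M)\rangle .
\end{align}
```

Both integrals depend on the cutoff profile of ⟨Ncen(M)⟩ only through weighted averages over halo mass, which is why its detailed shape matters little once the width and mass scale are free.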

In the implementation of the HOD model, we make the
assumption that satellite galaxies follow the spatial
distribution of the dark matter inside haloes. Although this
assumption is commonly adopted in HOD modelling of galaxy
clustering and is loosely motivated by theoretical studies (e.g.
Nagai & Kravtsov 2005), it needs to be further tested. In
hydrodynamic galaxy formation models, the spatial profile
of satellite galaxies depends on the implementation details.
For example, stellar mass loss can be different for satellites
in models with galactic winds of different strengths (e.g.
Simha et al. 2012), leading to differences in the spatial
distribution profile of satellites for a given stellar mass
threshold (or galaxy number density). Given such uncertainties,
in modelling galaxy clustering, one can introduce freedom
in the satellite spatial profile, and galaxy formation models
can help inform a sensible parametrization of such a profile.

More generally, comparison of the spatial distributions
of satellites, dark matter, and subhaloes in hydrodynamic
and N-body simulations can also help to evaluate the
limitations of each model, to improve the prescriptions of each
model, and to choose the best one to model the clustering
for a given sample of galaxies. The validity of the
SHAM method can also be tested with such simulations.
Simha et al. (2012) applied the SHAM model (with Macc as
the halo/subhalo variable) to collisionless N-body simulations
and compared the results with the galaxies in the corresponding
hydrodynamic simulations (with the same initial conditions).
They found good agreement for the HODs and satellite
distribution profiles for galaxy samples defined by thresholds
in stellar mass. They also found that SHAM slightly
overpopulates massive haloes and hence overpredicts the
small-scale clustering, which is attributed to the stellar mass
loss of satellite galaxies. The trend seems to be opposite to our
results, although the details depend on the implementation
of the strength of galactic winds. Chaves-Montero et al.
(2015) also investigated the SHAM model with N-body and
hydrodynamical simulations (the EAGLE simulation) for
stellar mass threshold galaxy samples, using various circular
velocities as the halo/subhalo variables. They found that the
peak circular velocity of a subhalo after relaxation, which is
a modified version of the Vpeak used in our models, correlates
most strongly with the galaxy stellar mass. The SHAM
model using this parameter shows better agreement with the
galaxy clustering measurements in the hydrodynamic
simulations. Further investigations following the above ones will
be useful (e.g. for luminosity-threshold samples).

One basic assumption of the HOD model is that the
statistical properties of the galaxy content in a halo only
depend on the halo mass. Since the clustering of haloes
of the same mass depends on the halo assembly history
(e.g. Gao et al. 2005; Wechsler et al. 2006; Zhu et al. 2006;
Jing et al. 2007), the above assumption means that the halo
assembly effect is not translated into galaxy properties in
haloes of the same mass. If the galaxy assembly effect exists
(meaning that galaxy properties are correlated with
halo assembly), it would possibly affect the HOD modelling
(e.g. Zu et al. 2008; Zentner et al. 2014; Hearin et al.
2015; Paranjape et al. 2015) and the current HOD framework
would then need to be extended. However, there is
no definite conclusion yet on whether the assembly bias in
galaxy properties shows up in hydrodynamic simulations
(e.g. Berlind et al. 2003; Chaves-Montero et al. 2015) or in
galaxy clustering measurements (e.g. Lin et al. 2016).
According to the investigation by Chaves-Montero et al. (2015)
with hydrodynamic simulations, modelling (with SHAM)
based on a certain circular velocity variable can capture about
50% of the assembly bias effect in galaxy clustering. Since
the SCAM models with circular velocity we introduce in
this paper are still less successful than the HOD model, it
remains to be seen whether the galaxy assembly effect is
significant in real data. In any case, further studies on galaxy
assembly are necessary and we reserve such investigations
for future work.